Method for computer-aided generation of prognoses for
operative systems, and a system for the generation of
prognoses for operative systems



The invention relates to a method for computer-aided 
generation of prognoses for operative systems, in 
particular for control processes and the like, on the 
basis of multidimensional data records describing the 
state of a system, product and/or process, and applying 
the SOM method in which an ordered grid of nodes 
representing the data distribution is determined. 

Furthermore, the invention relates to a system for the 
generation of prognoses for operative systems, in 
particular for control processes, on the basis of 
multidimensional data records describing a state of a 
system, product and/or process, having a database for 
storing the data records, and having an SOM unit for 
determining an ordered grid of nodes representing the 
data distribution. 

Numerous control techniques in operative systems, for
example in industrial applications, but also in the
automation of marketing measures through to financial
trading systems, are based on automatic units for the
generation of prognoses of specific parameters of
features, quality or systems. The accuracy and
reliability of such prognosis units is for the most
part an essential precondition for the efficient
functioning of the entire control.

The prognosis models for this purpose are frequently
implemented on the basis of classical statistical
methods (so-called multivariate models). However, the
relationships that should be recorded in the basic
prognosis models are frequently of a nonlinear nature.
On the one hand, the conventional statistical methods
cannot be applied directly to these prognosis models;
on the other hand, their nonlinear statistical
extensions can be automated only with difficulty.

Consequently, in order to model nonlinear dependences,
recourse has been made in part to methodological
approaches from the field of artificial intelligence
(genetic algorithms, neural networks, decision trees
etc.) that promise a better exploitation of the
information in nonlinear relationships. Prognosis
models that are based on these methods are scarcely
used in automated systems, for example, because their
efficiency and stability and/or reliability generally
cannot be ensured. One reason for this is the absence
of statistically reliable statements on the limits of
the efficiency and validity of black box models, that
is to say on problems relating to overfitting,
generalizability, explanation components etc.

The present technique is based on the use of the
so-called SOM (Self-Organizing Maps) method. This
SOM method, which is used as a basis for nonlinear data
representations, is well known per se, compare
T. Kohonen, "Self-Organizing Maps", 3rd edition,
Springer Verlag Berlin 2001. Self-organizing maps
constitute a non-parametric regression method by means
of which data of any desired dimension can be mapped
into a space of lower dimension. The original data are
abstracted in the process.

The most commonly used method for data representation,
and also for visualization, in the case of the SOM
method is based on a two-dimensional hexagonal grid of
nodes for representing the SOM. Starting from a number
of numerical multivariate data records, the nodes of the
grid are continuously adapted to the form of the data
distribution during an adaptation operation. Because
the arrangement of the nodes among one another reflects
the neighborhood inside the data volume, features and
properties of the data distribution can be read
directly from the ensuing "landscape". The resulting
"map" constitutes a representation of the original data
distribution that retains local topology.
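
By way of illustration only, the adaptation operation
can be sketched as follows in Python. The sketch rests
on assumptions that are not taken from the source: a
rectangular instead of a hexagonal grid, a Gaussian
neighborhood function, linearly decaying learning rate
and neighborhood width, and the hypothetical function
names train_som, best_matching_unit and
quantization_error.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(nodes)), key=lambda i: sq_dist(nodes[i], x))


def train_som(data, rows, cols, steps=3000, seed=0):
    """Online Kohonen update on a rows x cols rectangular grid (assumed layout)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per node, initialised at random in [0, 1).
    nodes = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    # Fixed grid coordinates define the neighborhood between nodes.
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    for t in range(steps):
        frac = t / steps
        lr = 0.5 * (1.0 - frac) + 0.01      # learning rate decays over time
        sigma = 2.0 * (1.0 - frac) + 0.3    # neighborhood width decays over time
        x = rng.choice(data)
        b = best_matching_unit(nodes, x)
        for i, w in enumerate(nodes):
            d2 = (grid[i][0] - grid[b][0]) ** 2 + (grid[i][1] - grid[b][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))
            for j in range(dim):
                # Move the node towards the record, scaled by grid proximity.
                w[j] += lr * h * (x[j] - w[j])
    return nodes


def quantization_error(nodes, data):
    """Mean distance between each record and its closest node."""
    total = sum(math.sqrt(sq_dist(nodes[best_matching_unit(nodes, x)], x)) for x in data)
    return total / len(data)


# Two well-separated groups of two-dimensional records (synthetic data).
rng = random.Random(1)
data = [[0.2 + rng.gauss(0, 0.05), 0.2 + rng.gauss(0, 0.05)] for _ in range(100)]
data += [[0.8 + rng.gauss(0, 0.05), 0.8 + rng.gauss(0, 0.05)] for _ in range(100)]
som = train_som(data, rows=4, cols=4)
print(len(som), quantization_error(som, data))
```

Each record is assigned to exactly one node (its
best-matching unit), which corresponds to the statement
that a data record occupies exactly one site on the map.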

The following example serves to explain the SOM method:

There are 1000 persons on a football pitch who are
randomly distributed on the playing area. 10 features
(for example sex, age, body size, income etc.) are now
defined which are to be used to compare all the
1000 persons with one another. They converse and
exchange places until each of them is surrounded by the
persons who are most similar to him/her with reference
to the defined comparative properties. A situation is
thereby reached in which each of the participants is
most similar to his/her immediate neighbors with
reference to the totality of the features.

This makes plain how it is possible to arrive at a
two-dimensional representation despite the
multidimensionality of the data. With this distribution
of the persons on the playing field, it is now possible
to represent each of the features two-dimensionally
(for example in a color-coded fashion). In this case,
the color range of the values reaches from blue
(lowest-level expression of the feature) to red
(highest-level expression of the feature). If all the
features are visualized in this way, a colored map is
obtained from which the distribution of the respective
features, that is to say variables, can be detected
visually. It is to be noted in this case that,
irrespective of the feature considered, a person (or a
data record) is positioned at exactly one site on the
football pitch.

Further features can also be associated with a finished
SOM; in this case, features of the data records that
are not taken into account when calculating the SOM are
represented graphically just like features that have
been included in the SOM. The distribution of the data
records within the SOM no longer changes in this case.

One application of SOM is described in WO 01/80176 A2,
in which the aim pursued is that of dividing a total
data volume into partial data volumes in order then to
calculate prognosis models on them. However, the aim
here is to raise the performance of the calculation by
distributing the computing load over a number of
computers. Although this method is also based in part
on SOMs, this is not for the purpose of optimizing the
quality of prognosis, but (first and foremost) for the
purpose of shortening the calculating time through the
distributed computation and the subsequent combination
of the individual models. The method of prognosis used
in this case is based, in particular, on the so-called
Radial Basis Function (RBF) networks that are
associated with a special SOM variant that optimizes
the entropy of the SOM representation.

Furthermore, another application of the SOM method is
known from DE 197 42 902 A1, specifically in the
planning and carrying out of experiments, although here
the aim is specifically a process monitoring with the
use of SOM without any sort of prognoses.

It is an object of the invention to provide a method
and a system of the type presented at the beginning
with the aid of which it is possible to achieve a high
efficiency and an optimization of the accuracy of the
prognoses, in order thus to enable a high level of
efficiency of the control application based thereon in
the respective operative system; the aim as a
consequence is to be able to obtain products of
higher quality in fabrication processes, for example.

The method according to the invention, of the type
presented at the beginning, is characterized in that,
in order to take account of nonlinearities in the data,
an internal scaling of variables is undertaken on the
basis of the nonlinear influence of each variable on
the prognosis variable; in that local receptive regions
assigned to the nodes are determined, on the basis of
which local linear regressions are calculated; and in
that optimized prognosis values for controlling the
operative system are calculated with the aid of the set
of local prognosis models that is thus obtained, this
being done by determining the respectively adequate
node for each new data record and applying the local
prognosis model to this data record.

In a corresponding way, the system according to the
invention, of the type specified at the beginning, is
characterized in that the SOM unit is assigned a
nonlinearity feedback unit for the internal scaling of
variables in order to compensate their nonlinear
influence on the prognosis variable, as well as a
calculation unit for determining local linear
regressions on the basis of local receptive regions
assigned to the nodes, optimized prognosis values being
calculated in a prediction unit on the basis of the
local prognosis models thus obtained, this being done
by determining the respectively adequate node for each
new data record and applying the local prognosis model
to this data record.

In accordance with the invention, the data space is
therefore firstly decomposed into microclusters, and
thereafter an optimum zone which is respectively as
homogeneous as possible is determined about these
clusters for the regression. Different local
regressions are subsequently calculated in all these
zones and are then applied individually for each data
record for which it is intended to calculate a
prognosis, depending respectively on the microcluster
in which it comes to lie or to which it belongs.

The particular efficiency of the present prognosis
technique is consequently achieved by adapting
classical statistical methods such as regression
analysis, principal component analysis and cluster
analysis to the specific facts of SOM technology. With
the local linear regression, the statistical regression
analysis is respectively applied only to a portion of
the data, this portion being determined by the SOM,
that is to say by the "neighborhood" in the SOM map. It
is possible within this subset to generate a regression
model that is substantially more specific than a single
model over all the data. Many local regression models
with overlapping data subsets are generated overall for
a prognosis model. It is always only the "closest"
model that is used in determining a prognosis value.
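
The idea of many local regression models with
overlapping data subsets, of which only the closest one
is applied, can be illustrated by the following minimal
Python sketch. It is a simplification under stated
assumptions: a single input variable, fixed node
positions standing in for the nodes of a trained SOM,
one common receptive radius, and hypothetical function
names (fit_line, fit_local_models, predict).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_local_models(xs, ys, nodes, radius):
    """One regression per node, fitted on the records within `radius` of it."""
    models = []
    for m in nodes:
        idx = [i for i, x in enumerate(xs) if abs(x - m) <= radius]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(x, nodes, models):
    """Determine the closest node and apply only its local model."""
    nearest = min(range(len(nodes)), key=lambda i: abs(x - nodes[i]))
    a, b = models[nearest]
    return a + b * x


xs = [i / 100 for i in range(101)]
ys = [x * x for x in xs]              # nonlinear target (synthetic)
nodes = [0.1, 0.3, 0.5, 0.7, 0.9]     # assumed stand-in for SOM node positions
models = fit_local_models(xs, ys, nodes, radius=0.2)

a, b = fit_line(xs, ys)               # single model over all the data
global_sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
local_sse = sum((y - predict(x, nodes, models)) ** 2 for x, y in zip(xs, ys))
print(global_sse, local_sse)
```

With a radius of 0.2 and a node spacing of 0.2 the data
subsets of neighboring nodes overlap, while each
prediction uses one model only.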

The present technique therefore combines the capacity
of the self-organizing maps (SOMs) for nonlinear data
representation with the calculation of multivariate
statistics, in order to raise the efficiency of the
prognosis models, and to optimize the use of
differentiated, distributed prognosis models in
automated control systems. The difficulties of the
known proposed solutions are overcome in this case by
departing from a purely methodological approach. The
function of integrated prognosis models, in particular
their automated application in control processes, is
decomposed into individual action areas that are
solved independently and finally joined in a novel
fashion into a functional whole.



In a departure from the prior art, the invention also
takes account of the circumstance that individual
variables can have a different, nonlinear influence on
the prognosis variable; in order to take account of
these nonlinearities in the data, and to provide an at
least far-reaching compensation therefor, a
nonlinearity analysis is carried out on the basis of a
global regression in conjunction with local prognosis
models, nonlinearity measures being derived from which
scaling factors for internal scaling are determined in
order to take account of the given nonlinear
relationships. The optimized SOM representation is
generated after this internal scaling has been carried
out.

It is of particular advantage in this connection when
for each variable a measure is formed for its order
in the SOM representation and a measure is formed for
its contribution to the explained variance, new
internal scalings being determined from these
measures on the basis that the estimated change in
the explained variance is maximized by varying the
internal scalings, as a result of which the variables
are ordered in the resulting SOM representation in
accordance with their contributions to the explained
variance, so that existing nonlinearities are more
accurately resolved.

A certain margin, bounded by the required significance
on the one hand and by the necessary stability on the
other hand, is present during the determination of the
respective receptive regions (or receptive radii, which
define these regions). Within these bounds, it is
possible to find an optimum receptive region for which
the variance of the residues is minimal. According to
the invention, it is therefore advantageous in
particular, when the receptive regions assigned to the
nodes are being determined, if their magnitude is
respectively selected such that the explained variance
of the local regression is maximal while significance
and stability in the region of the node are
simultaneously safeguarded. It is particularly
advantageous in this case, when the receptive regions
assigned to the nodes are being determined, if the
smallest necessary receptive region is selected for the
significance of the regression, and the largest
possible receptive region is selected for maximizing
the accuracy of prognosis.

It has also proved to be advantageous when the internal 
scaling is carried out iteratively. 

It is advantageous, furthermore, according to the
invention when the supplied data are subjected in
advance to a compensating scaling in order to
compensate, at least partially, any possible
correlations between variables. Starting values that
can be used effectively for the further processing are
obtained in this way. It has proved to be an
advantageous mode of procedure in this case when the
individual data records are rescaled for the purpose of
the compensating scaling, the values of a respective
variable of all the data records being standardized,
after which the data are transformed into the principal
component space and the compensating scalings of the
individual variables are calculated on the basis that
the distance measure in the original variable space
differs minimally from the distance measure in the
standardized principal component space. Furthermore, it
is also advantageous for the purpose of simplifying the
method when the compensating scaling is multiplicatively
combined with the internal scaling, which takes account
of the nonlinearities in the data, in order to form a
combined variable scaling on which an SOM
representation modified in accordance therewith is
based.

Advantageous for the respective process control is a
special embodiment of the system according to the
invention that is characterized in that connected to
the prediction unit are a number of control units that
are assigned to individual process states and predict
the process results that would arise for the current
process data.

It is also advantageous here when respectively
separately assigned process units, which derive control
parameters on the basis of the predicted process
results and of the desired values for the process
respectively to be carried out in the operative system,
are connected to the control units.

The invention is explained in yet more detail below
with the aid of particularly preferred exemplary
embodiments, to which, however, it is not intended to
be limited, and with reference to the drawing, in
which:

figure 1 shows a schematic, in the form of a block
diagram, of a system for the generation of prognoses,
the cooperation of the individual components of this
prediction system being illustrated, in particular;

figure 2 shows a schematic of individual system modules
in more detail;

figure 3 shows a flowchart for illustrating the mode of
procedure in the case of the method according to the
invention;

figure 4 shows a diagram for illustrating the mean
range as a function of the receptive radius, for
different variables;

figure 5 shows a schematic of one dimension of a
receptive region for a local linear regression;

figures 6 and 7 show two diagrams for the nonlinear
measure of determination or the estimated error as a
function of the receptive radius for the purpose of
determining the optimal receptive radius;

figure 8 shows a schematic illustration of the system
according to the invention in an application for a
process control, in a type of block diagram;

figure 9 shows in the partial figures 9A, 9B and 9C,
SOM representations for different variables in an
exemplary continuous steel casting process;

figure 10 shows in the partial figures 10A, 10B and 10C
corresponding SOM maps after a second iteration step
has been run through;

figure 11 shows the SOM representation for one of the
variables after a second iteration step, the ordering
of the data (figure 11A), the nonlinear influence
(figure 11B) and the distribution of the receptive
radii (figure 11C) being shown; and

figure 12 shows a diagram that illustrates the change
in the parameters on the basis of the iterations.

It is known that data may be illustrated in the SOM
representation such that it is possible for specific
properties of the data distribution to be seen
immediately from the SOM map. For the purpose of
visualization, in this case the SOM map contains a grid
of nodes ordered according to prescribed rules, for
example in hexagonal form, the nodes of the grid
representing the respective microclusters of the data
distribution. An example of this is illustrated in
figures 9, 10 and 11, which are explained in more
detail below.

In the course of the present method, large data volumes
are now compressed in the SOM representation such that
the nonlinear relationships in the representation are
retained. As a result, those data sectors
(microclusters) which contain the information relevant
to the modeling can be selected individually and
independently. The extremely short access times to
these data sectors enable a substantially
differentiated subdivision of the database, and thereby
a targeted use of the included nonlinearities for the
generation of the model.

The combination of the statistical calculus with
suitably selected data sectors consequently permits
information present in the nonlinear relationships to
be used in conjunction with safeguarding of statistical
requirements relating to quality and significance. The
selection of the local data sectors, that is to say the
receptive regions, is optimized in this case for
obtaining prognosis models that are as efficient as
possible.

The set of all the optimized local regression models
can be used to make a statement as to how far the
underlying data representation is suitable for
representing the nonlinear relationships of the
variables to the target variable (nonlinearity
analysis). The representation parameters of the SOM
data compression (that is to say internal scalings) can
be optimized therefrom in an iterative step so as to
obtain an improved resolving power for the
nonlinearities, and this leads in consequence to local
prognosis models that are more accurate.

The particular type of SOM data representation then
permits the visualization of all the local model
parameters in an image. The safeguarding of the
validity and efficiency of the entire prognosis model
is simplified, accelerated and improved by the
simultaneous comparison of parameters relevant to
quality.

The prognosis model as a whole comprises the set of all
the local prognosis models, which are to be regarded as
logically or physically distributed. In the operational
mode of the prognosis model, each new data record is
firstly assigned to that microcluster which is closest
to it. Thereupon, the local prognosis model of this
microcluster is applied to the data record, and the
prognosis result obtained is fed to the (preferably
local) control or processing unit.

The specific SOM data representation or data compression
occupies a central position in the present method. The
historical process data stored in accordance with the
illustration in figure 1 in a database 1 serve the
purpose of SOM generation, carried out in an SOM unit 2
inside a prediction unit 3, in a first iteration step
of the method. On the basis of this SOM, newly
calculated scalings are fed back, as a result of a
nonlinearity analysis carried out in a unit 4, to the
SOM unit 2, that is to say to the data representation,
in a second iteration step. These scalings optimize the
SOM data representation with regard to taking optimum
account of nonlinear relationships in the data for the
prediction over the local data sectors, as will be
explained in yet more detail below.

The generation of local linear regression models is
performed in a calculating unit 5 by taking account of
a receptive radius that is selected for the respective
regression model in an optimum fashion with regard to
the prognosis quality. The receptive radius is used to
determine how many data records from the environment of
a microcluster are used for the regression. The larger
the radius, the more data records from the surrounding
nodes are used: all the data records are used when the
radius tends to "infinity". The more distant nodes have
a lesser influence because of the Gaussian weighting
functions that are preferably used in this case.
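
The effect of the receptive radius on the weighting can
be illustrated by the following short Python sketch.
The source states only that Gaussian weighting
functions are preferably used; the concrete form
exp(-d^2 / (2 r^2)), the normalization of the weighted
variance by the sum of weights, and the function names
are assumptions made for illustration.

```python
import math


def gaussian_weights(grid_distances, radius):
    """Weight of each node as a function of its grid distance to the central node."""
    return [math.exp(-(d * d) / (2.0 * radius * radius)) for d in grid_distances]


def weighted_mean_and_variance(values, weights):
    """Weighted mean and (sum-of-weights normalized) weighted variance."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, var


distances = [0, 1, 2, 3, 4]                       # grid distances to the central node
narrow = gaussian_weights(distances, radius=1.0)  # distant nodes count very little
wide = gaussian_weights(distances, radius=1e6)    # radius tending to "infinity"
print(narrow)
print(wide)
```

For a very large radius all weights approach 1, so that
all the data records enter the regression with equal
influence.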

The totality of all the local linear regression models
over the data sectors in combination with the SOM
constitutes the optimized prognosis model. This overall
model can be represented visually by means of a
visualization unit 6 and, as explained below in more
detail with the aid of figure 8, it can, if
appropriate, be distributed over individual control
subunits and used there to generate, from current
process data, specific prognoses for the respective
control units with regard to the process results,
which prognoses are then used to control these process
units.

For the sake of simplicity, figure 1 illustrates only a
general control unit 7 that is connected to a general
process unit 8. The process data transmission, which is
performed in real time for the purpose of application
to current process data, is illustrated by an arrow 9,
and arrow 10 indicates the flow of control data;
finally, arrows 11, 12 illustrate the feeding of
current process data to the respectively preceding
units.

The cooperation of the individual system components is
illustrated in detail in figure 2 for the purpose of
explanation. It is to be seen here that the SOM unit 2,
which is provided for the representation and
compression of data, is connected via a coil 13 of the
prediction unit 3 to the other units such as, in
particular, the nonlinearity feedback unit 4, from
where the results of the local modeling are fed back to
the data representation in order then to generate in
the calculation unit 5 the optimized linear regression
models over local data sectors. The visualization unit
6 then displays the SOM map thus generated, and also
permits visual monitoring.



Figure 3 is a schematic of the sequence of the
technique according to the invention, block 14
illustrating the data archiving and prescription of
target data. A global regression and/or residues are
calculated in a way known per se on the basis of these
data in a first step (see block 15 in figure 3), after
which internal scalings for obtaining the SOM
representation are determined in accordance with block
16.

In detail, each data-based prognosis proceeds from a
distribution of raw data that consists of K points
$x^{\circ}_{k,j}$ (where k = 1...K), each point having
L components (where j = 1...L). The prognosis is focused
on a target variable $y_k$ that is in general a
nonlinear function of the points $x^{\circ}_{k}$ and is
a random variable in the statistical sense. In the
present technique, the variables $x^{\circ}_{j}$ (the
index k being omitted below for the sake of
simplicity), having the variance

$$\sigma^{\circ\,2}_{j} = \mathrm{Var}(x^{\circ}_{j}),$$

are first standardized and then (in accordance with
step 16 in figure 3) scaled with new factors, these
factors being termed internal scalings $\sigma_j$; the
variables used below are therefore

$$x_j = \sigma_j \, \frac{x^{\circ}_{j} - \bar{x}^{\circ}_{j}}{\sigma^{\circ}_{j}}.$$




The covariance matrix C of the scaled variables $x_j$
can always be diagonalized by an orthogonal matrix
$A_{iq}$ (where i = 1...L and q = 1...Q):

$$C_{ij} = \frac{1}{K-1} \sum_{k=1}^{K} x_{k,i}\, x_{k,j},$$

in which case it holds that

$$C = A \cdot C^{\mathrm{diag}} \cdot A^{T}$$

and, for the eigenvalues $E_q$, that

$$E_q = \left(C^{\mathrm{diag}}\right)_{qq}.$$

Moreover, the covariance matrix C can be decomposed as
$C = B \cdot B$, where

$$B_{ij} = \sum_{q=1}^{Q} A_{iq} \sqrt{E_q}\, A_{jq}.$$

The components $x_j$ of the data vector x are
transformed into the principal component space by means
of the transformation matrix $A_{jq}$:

$$x_q = \sum_{j=1}^{L} A_{jq}\, x_j,$$

where q = 1...Q, Q being the number of the principal
components.
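
The diagonalization of the covariance matrix, the
decomposition C = B·B and the transformation into the
principal component space can be checked numerically on
a small example. The following Python sketch is
restricted, as an assumption for brevity, to two
variables, for which the orthogonal matrix A can be
written in closed form as a rotation; the data points
and function names are hypothetical.

```python
import math


def covariance_2d(points):
    """Sample covariance matrix (divisor K - 1) of two-dimensional points."""
    k = len(points)
    mean = [sum(p[i] for p in points) / k for i in range(2)]
    c = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / (k - 1)
    return c, mean


def eigen_2x2(c):
    """Orthogonal A (columns are eigenvectors) and eigenvalues E of a symmetric 2x2 matrix."""
    theta = 0.5 * math.atan2(2.0 * c[0][1], c[0][0] - c[1][1])
    cs, sn = math.cos(theta), math.sin(theta)
    a = [[cs, -sn], [sn, cs]]
    e = [
        c[0][0] * cs * cs + 2.0 * c[0][1] * cs * sn + c[1][1] * sn * sn,
        c[0][0] * sn * sn - 2.0 * c[0][1] * cs * sn + c[1][1] * cs * cs,
    ]
    return a, e


def matmul(p, q):
    return [[sum(p[i][m] * q[m][j] for m in range(2)) for j in range(2)] for i in range(2)]


points = [[2.0, 1.0], [3.0, 2.5], [4.0, 2.0], [5.0, 4.5], [6.0, 4.0]]
C, mean = covariance_2d(points)
A, E = eigen_2x2(C)

# B_ij = sum_q A_iq * sqrt(E_q) * A_jq, so that B . B reproduces C.
B = [[sum(A[i][q] * math.sqrt(E[q]) * A[j][q] for q in range(2)) for j in range(2)]
     for i in range(2)]
BB = matmul(B, B)

# Principal components of the centred points: x_q = sum_j A_jq * x_j.
scores = [[sum(A[j][q] * (p[j] - mean[j]) for j in range(2)) for q in range(2)]
          for p in points]
var_pc1 = sum(s[0] ** 2 for s in scores) / (len(points) - 1)
print(E, var_pc1)
```

The variance of the first principal component equals
the first eigenvalue, which is what the diagonal form of
C expresses.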

A calculation aimed at the SOM data representation is
now performed in accordance with block 17 in figure 3.

The generation of an SOM is performed in a way known
per se using the Kohonen algorithm (Teuvo Kohonen,
Self-Organizing Maps, Springer Verlag 2001). The
nonlinear representation of the data distribution
$\{x_k\} = \{x_{k,j}\}$ by an SOM is in this case
essentially a function of the internal scalings
$\sigma_j$ of the variables $x_j$. Thus, multiplying the
internal scalings $\sigma_j$ with freely determinable
factors $\pi_j$ changes the data representation, which
is yielded from the new scalings, and specifically in
accordance with $\sigma'_j = \sigma_j \cdot \pi_j$.

The SOM data representation can be used to define
subregions of data. If an SOM consists of N nodes with
representing vectors $m_l$, where l = 1...N, a subset of
data can be selected by virtue of the fact that it lies
inside a receptive radius r around a specific node l:

$$\{x_{k_l}\}: \; \lVert x_{k_l} - m_{l'} \rVert = \min \;\text{ and }\; l' \in U_r(l), \quad k_l = 1 \ldots K_l,$$

where
$m_{l'}$ = representing vector of the node l', and
$U_r(l) = \{l'\}$ ... environment of the node l, in
which it holds that: $\lVert l - l' \rVert < r$.
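
The selection of such a subset can be sketched in Python
as follows. The sketch assumes that the distance between
nodes is the Euclidean distance between their grid
coordinates and uses a toy map with four nodes along a
line; the function names and data are hypothetical.

```python
def best_matching_unit(nodes, x):
    """Index l' of the node whose representing vector is closest to x."""
    return min(range(len(nodes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(nodes[i], x)))


def receptive_region(data, nodes, grid, l, r):
    """Indices of the records whose closest node l' lies within grid distance r of node l."""
    selected = []
    for k, x in enumerate(data):
        lp = best_matching_unit(nodes, x)
        d = ((grid[l][0] - grid[lp][0]) ** 2 + (grid[l][1] - grid[lp][1]) ** 2) ** 0.5
        if d < r:                      # strict inequality, as in ||l - l'|| < r
            selected.append(k)
    return selected


grid = [(0, 0), (0, 1), (0, 2), (0, 3)]        # grid coordinates of the nodes
nodes = [[0.0], [1.0], [2.0], [3.0]]           # representing vectors
data = [[0.1], [0.9], [1.2], [2.1], [2.9], [3.2]]

region_wide = receptive_region(data, nodes, grid, l=1, r=1.5)
region_narrow = receptive_region(data, nodes, grid, l=1, r=0.5)
print(region_wide, region_narrow)
```

With r = 0.5 only the records belonging to the node
itself are selected; with r = 1.5 the records of the two
adjacent nodes are added.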

The individual variables $x_j$ are resolved with
different degrees of effectiveness in a given SOM data
representation. In the present method, the order of the
SOM with reference to the variables $x_j$ is described
for a prescribed receptive radius r by the mean range
$\lambda_j$:

$$\lambda_j^2(r) = \frac{s_j^{r}(r)}{s_j}, \quad\text{where}\quad s_j := \sigma_j^2\,(K-1) \quad\text{and}\quad s_j^{r}(r) := \sum_{l=1}^{N} \sigma_j^{(l)\,2}\,(K_l - 1)\,\frac{H_l}{K_l},$$

in which case
$H_l$ is the number of data records in the node l,
$\sigma_j^{(l)\,2}$ is the variance of the variables
$x_j$ in the local data volume $K_l$, and
$H_l / K_l$ is a weighting factor for the node l.

Illustrated in a diagram in figure 4 by way of example
is the square of the mean range $\lambda_j^2(r)$ as a
function of the receptive radius r for a number of
variables V, K and T, the underlying example, explained
in more detail below, here being that of a continuous
steel casting in the case of which it is assumed that
the target variable of "tensile strength" is a function
of the parameters of strand removal rate V, removal
temperature T and concentration K of chromium in the
alloy composition, and forecasts relating to the steel
quality (more precisely the tensile strength) are to be
made on the basis of V, T and K data.

Given a fixed receptive radius $r_l$, it is clear that
it holds for the range value $\lambda_j$, averaged over
all the nodes, that:

$\lambda_j \to 0$ ... complete order of the SOM within
the receptive radius $r_l$ as regards the variable
$x_j$, and

$\lambda_j \to 1$ ... complete loss of information for
local regression in terms of the variables $x_j$ within
the receptive radius $r_l$ with reference to global
nonlinearities in $x_j$.

In order to obtain as balanced as possible an SOM as
starting point for the following steps, internal
scalings can preferably be determined by a method that
is suitable for compensating any correlations in the
data distribution.



These compensating factors $\pi_j$ for each variable j
are calculated such that the distance measure in the
given data space comes as close as possible to the
distance measure in the standardized principal
component space (Mahalanobis distance). This is
fulfilled when:



As an alternative to these factors, or in addition
thereto, starting values for the scalings can also be
used from preceding univariate nonlinearity analyses of
the residues.

A regression of all K data points to the target
variable y is denoted as global regression (compare
step 15 in figure 3). The estimated regression
coefficients $\beta_0$, $\beta_j$ for the estimator
$\hat{y}$ of the target variable y, where

$$\hat{y}_k = \beta_0 + \sum_{j=1}^{L} \beta_j\, x_{k,j},$$

are calculated on the basis of covariance matrix C in a
conventional way (compare, for example, the so-called
stepwise regression method or the complete regression
method).

WO 2004/029738, PCT/AT2003/000289

The residues $u_k$ of the global regression are yielded
as $u_k = y_k - \hat{y}_k$.

On the basis of an SOM representation, a local
regression to the residue $u_{k_l}$ can now be
calculated for each subset of data points
$\{x_{k_l}(r_l)\}$ that lies inside a receptive radius
$r_l$ around the node l (compare step 18 in figure 3).
If there is a nonlinear relationship between the target
variable y and the variables $x_j$, the SOM
representation was generated independently of the
target variable y, and the local regression is
significant with reference to the variables $x_j$, it
is possible for a portion of the scattering in the
residue u (which has remained unexplained in global
terms) to be explained.

A simplified example for such a local linear regression
is shown in figure 5, where a multiplicity of data
points and a total regression curve (not denoted in
more detail) are shown, it being evident that the
receptive radius r, which defines the receptive region
for the regression, can be fixed between a minimum
$r_{min}$ and a maximum $r_{max}$, these bounds
$r_{min}$, $r_{max}$ being given by the significance and
linearity, respectively, of the local model. The local
regression line is denoted by 18'.

The local regression model obtained is valid for all
the data records that lie in the receptive region of
the respective node l; the best accuracy of prognosis
for new data records is in general found in the center
of the region, that is to say for those $H_l$ data
records that are situated closest in Euclidean terms to
the representing vector $m_l$ (that is to say those
that "belong" to the node l). It holds for this that:

$$l = \arg\min_{l'} \lVert x - m_{l'} \rVert.$$

The local regression models can be calculated, in turn,
on the basis of the local covariance matrices
$C^{(l)}$.



The receptive regions can preferably also be formed
with Gaussian weightings, the result of this being
weighted mean values, variances and degrees of freedom.
These details are ignored below for the sake of
simplicity.

The SOM representation can now be used to determine the local regression (in accordance with step 18 in figure 3) for each set of given receptive radii $r_l$ relating to the nodes $l$, with $l = 1 \ldots N$. The following sums of squares, known per se, can be formed in this case for the local residues:

total sum of squares of the global residue in the receptive region;

mean value of the global residue within $r(l)$, also termed offset;

sum of squares of the global residue relative to the local mean value;

total explained sum of squares in the local residue;

sum of squares explained by the local regression;

sum of squares explained by the offset;

unexplained sum of squares, residue of 2nd order.
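The quantities listed above can be illustrated with the textbook decomposition of a least-squares fit. The sketch below assumes NumPy and synthetic residues; the dictionary keys and the function name `local_sums_of_squares` are illustrative labels, not the symbols used in the method.

```python
import numpy as np

def local_sums_of_squares(X, u):
    """Decompose the sum of squares of the residue u in one receptive region."""
    K = len(u)
    offset = float(u.mean())                        # mean of the global residue
    ss_total = float(u @ u)                         # total sum of squares
    ss_offset = K * offset ** 2                     # explained by the offset
    ss_centered = float(np.sum((u - offset) ** 2))  # relative to the local mean
    design = np.column_stack([np.ones(K), X])
    coef, *_ = np.linalg.lstsq(design, u, rcond=None)
    resid = u - design @ coef
    ss_unexplained = float(resid @ resid)           # residue of 2nd order
    ss_regression = ss_centered - ss_unexplained    # explained by the regression
    return {
        "total": ss_total,
        "offset": offset,
        "centered": ss_centered,
        "offset_explained": ss_offset,
        "regression_explained": ss_regression,
        "total_explained": ss_offset + ss_regression,
        "unexplained": ss_unexplained,
    }

# Synthetic residues for one region (values invented for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
u = 0.3 + 0.8 * X[:, 0] + rng.normal(scale=0.5, size=200)
parts = local_sums_of_squares(X, u)
print({k: round(v, 3) for k, v in parts.items()})
```

Because the local fit contains an intercept, the total sum of squares splits exactly into the part explained by the offset, the part explained by the regression and the unexplained remainder.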



It holds for the unbiased estimator of the explained sums of squares (compare Kmenta, J., "Elements of Econometrics", 2nd edition, 1997, University of Michigan Press, Ann Arbor) that:

$J_l$ is the number of the regressors for the respective local regression with the receptive radius $r_l$ about the node $l$. In order for the regression to significantly explain a fraction of the total sum of squares of the residue, an overall test for the test variable $F^*$, known per se, must be fulfilled for each $K = K_l$, $J = J_l$.



A complete set of local regressions over the SOM representation to the residue $u$ is denoted below as the overall model (of the local regressions).
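The overall significance test referred to above for each local regression can be sketched with the textbook F statistic. This is an assumption for illustration: the test variable $F^*$ of the method may combine the sums of squares differently, and the critical value would have to be taken from an F distribution with $J$ and $K - J - 1$ degrees of freedom.

```python
def overall_f_statistic(ss_regression, ss_unexplained, K, J):
    """Textbook overall F statistic for a regression with J regressors
    (plus intercept) fitted on K data records."""
    dof = K - J - 1
    if dof <= 0:
        raise ValueError("need more records than regressors plus one")
    return (ss_regression / J) / (ss_unexplained / dof)

# Hypothetical sums of squares for one receptive region.
f_value = overall_f_statistic(ss_regression=30.0, ss_unexplained=10.0, K=24, J=3)
print(f_value)  # 20.0
```

A local model would be accepted only if this value exceeds the chosen critical value; regions with too few records for their number of regressors are rejected outright.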

The nonlinear corrected measure of determination $R^2_{NL}$, which is composed of the contributions of the weighted, estimated explained variances of the individual local regressions, can be regarded as the deciding variable for the explanatory power of the overall model.

The summing up of the local contributions to form a total value is preferably performed by weighting with the number of the data records $H_l$ that are assigned to the respective node $l$.




Essential factors on which the explanatory power of the overall model depends are:

a) the determination of optimal receptive radii $r_l$ for the local regressions;
b) the determination of an SOM data representation that effectively resolves the nonlinear relationships;
c) the combination of a) and b) so as to maximize the explanatory power of the overall model.

The accuracy of prognosis of the overall model depends (for a fixed, prescribed SOM data representation) substantially on the selection of the receptive radii $r_l$. In accordance with step 19 in figure 3, optimal receptive radii $r_l$ are now determined for all nodes $l$, as a result of which the desired local prognosis models are then obtained in accordance with step 20 for all the nodes for the optimal receptive radii $r_l$.

The optimal values $r_{opt}$ for the receptive radii $r_l$ can preferably be determined by maximizing the value of $R^2_{NL}$ together with simultaneous variation of all the receptive radii $r_l = r$; compare also the illustration in figure 6, where the maximum is shown in a typical curve of $R^2_{NL}$ at the radius $r_{opt}$.
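The search for a common radius can be sketched as a simple grid search. The sketch below assumes NumPy, synthetic one-dimensional data and four hypothetical nodes. As a deliberate simplification it scores each radius by a plain in-sample explained fraction rather than by the corrected measure $R^2_{NL}$, so it tends to favor small radii; the `min_points` guard is only a crude stand-in for the significance test.

```python
import numpy as np

def explained_fraction(X, u, nodes, radius, min_points=10):
    """Fraction of sum(u**2) removed when every record is corrected by the
    local linear model of its nearest node, fitted on all records within
    `radius` of that node."""
    d = np.linalg.norm(X[:, None, :] - nodes[None, :, :], axis=2)  # shape (K, N)
    owner = d.argmin(axis=1)
    corrected = u.copy()
    for l in range(len(nodes)):
        fit = d[:, l] <= radius
        own = owner == l
        if fit.sum() < min_points or not own.any():
            continue  # no usable local model for this node
        A = np.column_stack([np.ones(fit.sum()), X[fit]])
        coef, *_ = np.linalg.lstsq(A, u[fit], rcond=None)
        A_own = np.column_stack([np.ones(own.sum()), X[own]])
        corrected[own] = u[own] - A_own @ coef
    return 1.0 - float(corrected @ corrected) / float(u @ u)

def best_common_radius(X, u, nodes, candidates):
    scores = [explained_fraction(X, u, nodes, r) for r in candidates]
    return candidates[int(np.argmax(scores))], scores

# Synthetic data (not from the source): residue of a global line through x**2.
rng = np.random.default_rng(3)
X = rng.uniform(-2.0, 2.0, size=(400, 1))
G = np.column_stack([np.ones(len(X)), X])
y = X[:, 0] ** 2
beta, *_ = np.linalg.lstsq(G, y, rcond=None)
u = y - G @ beta

nodes = np.array([[-1.5], [-0.5], [0.5], [1.5]])  # hypothetical node positions
best_radius, scores = best_common_radius(X, u, nodes, [0.5, 1.0, 4.0])
print(best_radius, [round(s, 3) for s in scores])
```

With the largest radius every local model coincides with the global regression and explains nothing further, which mirrors the upper bound on the radius imposed by the linearity of the local model.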

As an alternative to this, $r_l$ can also be determined individually for each node $l$ by minimizing the estimated error in the region of a testing set about the node $l$. By way of example, this alternative is shown in the schematic of figure 7, where a minimum at the radius $r_l^{opt}$ is illustrated in a typical curve profile of the estimated error.

For the determination of the respective receptive radius $r_l^{opt}$, this alternative requires prior determination of a testing set of radius $r_l^{test}$ about the respective node $l$ that is large enough to estimate the error in the region of the node $l$ as significant. It is preferably required for this purpose that a local, significant regression model to the residue $u$ can be formed on the basis of this set itself, and that the relative error in the estimate of the explained variance for this set does not exceed a prescribed extent (so-called overfitting test).

An unbiased estimator for the error of the regression in the region of a "central" testing set is:

The local prognosis models thus formed in $r_l^{opt}$ lead to a particularly good explanatory power of the overall model.

Furthermore, the explanatory power of the overall model 
depends substantially on how well it is possible to 
distinguish the nonlinear influence of all the 
individual variables Xj on the target variable y (or on 
the residue u) for the local regressions in the data 
representation by the SOM. The task now is therefore to 
determine an advantageous SOM data representation. 

The targeted variation of the internal scalings $\sigma_j$ (compare also step 21 in figure 3, with the iteration feedback loop 22) can be used to influence the data representation such that those variables that make large contributions to $R^2_{NL}$ are more strongly "ordered" by the SOM, and their nonlinear influence on $R^2_{NL}$ can be calculated more effectively, and therefore optimized.

This requires, at least approximately, that the following be known:

a) how the nonlinearly explained variance, that is to say the nonlinear corrected measure of determination $R^2_{NL}$, is determined by individual variables, compare also step 23 in figure 3;

b) how the order of the variables $x_j$ in the SOM affects the variance that can be explained by the variables $x_j$; compare step 23 in figure 3; and

c) how the order of the variables $x_j$ depends on the internal scalings $\sigma_j$ (compare step 24 in figure 3).

The assignment of the explained variance $S_E^2$ (more precisely: the explained sum of squares) of a linear regression to individual variables is preferably performed by the following decomposition. It is assumed that the explained sum of squares of the parent population is

By decomposing the covariance matrix $C$ (compare above), it is possible for the explained sum of squares $S_E^2$ to be divided into a symmetrical sum of squares by component:

The summands $S_{E,j}^2$ can be regarded as correlation-adjusted contributions of the variables $x_j$ to the explained variance $S_E^2$. An unbiased estimator for the summands $S_{E,j}^2$ is

with the definition $d_j := (B \cdot C_0^{-1} \cdot B)_{jj}$.

If the regression was formed over a subset of the indices $j = 1 \ldots J$ of the variables $x_j$, $j = 1 \ldots L$, then $C_0^{-1}$ is that matrix which results from inversion of that subregion of the covariance matrix $C$ which corresponds to those variables $x_j$, $j = 1 \ldots J$, accepted into the regression, supplemented by zero entries in those sectors which correspond to the unaccepted variables.

On the basis of the correlation with the accepted variables, it then also holds for the variables not accepted into the regression that $S_{E,j}^2 \neq 0$, in general.

The contribution of a variable $x_j$ to the explained variance of the overall model is determined by a weighted sum as follows for a given set of local regressions:



Defining

for the positive fraction of the explained variance in the overall model yields

as the characteristic number for the relative influence $I_j$ of the variables $x_j$ on the explained variance of the overall model.
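One plausible reading of this step is sketched below: negative contributions are set to zero and the remainder is normalized so that the relative influences sum to one, after which the nonlinear measure of determination is split in proportion to them. Both rules are assumptions made for illustration, since the defining relationships are not reproduced here; the contribution values passed to `relative_influences` are invented, while the influences and the value 0.238 are those of the worked example in the text.

```python
def relative_influences(contributions):
    """Keep only the positive part of each contribution and normalize
    (assumed rule, not confirmed by the text)."""
    positive = {name: max(value, 0.0) for name, value in contributions.items()}
    total = sum(positive.values())
    if total == 0.0:
        raise ValueError("no positive contribution to normalize")
    return {name: value / total for name, value in positive.items()}

def split_measure(r2_nl, influences):
    """Assign a share of the nonlinear measure of determination to each
    variable in proportion to its relative influence (assumed rule)."""
    return {name: r2_nl * share for name, share in influences.items()}

# Invented contributions, including one negative value.
influences = relative_influences({"V": 3.0, "K": 1.0, "T": -0.5})
print(influences)

# Influences and nonlinear measure of determination from the first iteration.
shares = split_measure(0.238, {"V": 0.687, "K": 0.210, "T": 0.103})
print({name: round(value, 4) for name, value in shares.items()})
```

The influences quoted for the steel casting example sum to one, which is consistent with a normalization of this kind.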

The nonlinear measure of determination $R^2_{NL}$ can likewise be assigned, with the relative influence $I_j$, to the individual variables $x_j$, specifically in accordance with the relationship



This decomposition is preferably used to describe the 
contributions of individual variables to the nonlinear 
measure of determination of an overall model formed 
from a set of local regressions. 




As already mentioned and now explained below, the explainable variance is dependent on the order of the SOM.

In order to simplify the description, it will be assumed below that the data distribution has been transformed into the space of the principal components, or that it holds in terms equivalent thereto that:

The loss of information by the lack of order of the data representation of the SOM with regard to the variables $x_j$ can be expressed by the average range $\lambda_j$ (compare above). The relationship between the loss of explainable variance and the range $\lambda_j$ can be approximated empirically by a loss function $D(\lambda_j)$ in accordance with the following relationship:

Those variables $x_j$ that have a strong influence on the explained variance of the target variable $y$ or on the residue $u$ are more strongly weighted in the present method, that is to say are provided with a larger scaling factor, such that the nonlinear dependence on the variables $x_j$ is more effectively taken into account, and the nonlinear measure of determination $R^2_{NL}$ can thus be maximized.

It is assumed for the investigation now following of the dependence of the average range on the internal scalings for the SOM that the internal scalings $\sigma_q$ of the transformed data distribution are present in accordance with the relationship $x'_{k,q} = \sigma_q \cdot x_{k,q}$.

In the principal component space, the ranges $\lambda_q$ depend in simplest approximation on $\sigma_q$ in a form that can be heuristically approximated by the following functional relationship:

This relationship $\lambda_q(\sigma_q)$ is sufficiently accurate to enable an iterative maximization (see loop 22 in figure 3) of the nonlinear measure of determination $R^2_{NL}$ by varying the internal scalings $\sigma_q$.



The steps explained above for determining an advantageous data representation are now combined with the optimization of the local receptive regions such that the nonlinear explained variance in the residue is maximized, that is to say the accuracy of prognosis of the overall model is optimized, as will now be explained in more detail.

It will be assumed below for the purpose of simplification that the data distribution has again been transformed into principal components. The approximate precondition that the loss functions $D(\lambda_q)$ are independent of one another yields the following for the variance fraction that can be explained to a maximum extent by the variable $x_q$:



Given a change in the internal scalings $\sigma_q \to \sigma'_q$, the consequence of this is a relative change $\psi$ in the explained variance in the overall model, in accordance with which $R^2_{NL}$ is maximized iteratively or explicitly by varying $\sigma'_q$. This is preferably performed by parametric approximation of the condition (see block 21 in figure 3)

$$\psi(\sigma'_1, \ldots, \sigma'_q) \to \max$$

on the basis of the partial derivatives of $\psi$ (so-called hill climbing), from which there follows a new set of $\lambda'_q$ and, from this, a set of scalings $\sigma'_q$.

These new scalings lead to a new SOM representation of the data that more effectively resolves the nonlinearities in the relationship $y(x_q)$ than on the basis of the scalings in the previous iteration step.
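The hill-climbing step can be illustrated generically as gradient ascent with numerically estimated partial derivatives. The sketch below is not the objective of the method: `toy_objective` is a hypothetical stand-in for $\psi$ with an arbitrarily chosen maximum, and the step size, difference width and iteration count are likewise assumptions.

```python
def hill_climb(objective, start, step=0.1, eps=1e-4, iterations=200):
    """Gradient ascent using central-difference partial derivatives."""
    x = list(start)
    for _ in range(iterations):
        gradient = []
        for i in range(len(x)):
            upper = x[:]
            lower = x[:]
            upper[i] += eps
            lower[i] -= eps
            gradient.append((objective(upper) - objective(lower)) / (2 * eps))
        x = [xi + step * gi for xi, gi in zip(x, gradient)]
    return x

def toy_objective(scalings):
    # Hypothetical stand-in for psi, with its maximum at (1.5, 0.7).
    return -((scalings[0] - 1.5) ** 2 + (scalings[1] - 0.7) ** 2)

best = hill_climb(toy_objective, [1.0, 1.0])
print([round(value, 4) for value in best])
```

In the method itself the objective would be evaluated from the local regressions and the loss functions, and each update would be followed by a recalculation of the SOM representation.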

Repeated application of the rescalings $\sigma_q \to \sigma'_q$ (loop 22 in figure 3) thus delivers a successive improvement of the data representation in which the accuracy of prognosis of the overall model is maximized by the optimization of the receptive ranges.

The optimized prognosis models and characteristics obtained are preferably also visualized, compare block 25 in figure 3, in order to permit additional validation of the overall model.

In accordance with block 26 in figure 3, the optimized prognosis models obtained in this way for all the nodes are applied in an appropriate way to new data (see block 27 in figure 3) in order thus to attain an optimized prognosis (block 28). In this case, the local prognosis model of that node whose representative is closest to the respective new data record is applied to that data record (compare above).

The sequence described above in general is explained in more detail below in a concrete exemplary application for controlling a continuous steel casting process, having the variables ($x_1$ to $x_3$): temperature T (strand shell), strand removal rate V and alloying constituent concentration K (for chromium), the target variable being a specific steel quality measure, for example the tensile strength of the steel. The steel production process is optimized in this case by the routine prognosis of the steel quality (the tensile strength). The predicted quality is used to vary the control parameters (the removal rate V in this case) continuously such that the actual tensile strength reaches the required level or quality.

It is assumed for the purpose of simplification that in this method only the three named control variables V, K and T of the process state determine the steel quality.

In this example, 26,014 data records were collected in the course of a production process as historical data for the generation of models. The individual variables, having the mean values

V = 0.291 m/s
K = 2.23% Cr
T = 540°C

were standardized in each case in the data conditioning to a mean value of 0 and a variance of 1 and further processed in this form.
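The standardization described above can be sketched as follows, assuming NumPy. The records are synthetic: they are centered on the mean values quoted in the text, but their spreads and their number are invented for illustration.

```python
import numpy as np

def standardize(data):
    """Scale every column to mean 0 and variance 1; also return the
    parameters so that new records can be transformed identically."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    return (data - mean) / std, mean, std

rng = np.random.default_rng(2)
raw = np.column_stack([
    rng.normal(0.291, 0.02, 1000),   # V in m/s (spread is an assumption)
    rng.normal(2.23, 0.15, 1000),    # K in % Cr (spread is an assumption)
    rng.normal(540.0, 12.0, 1000),   # T in degrees C (spread is an assumption)
])
z, mean, std = standardize(raw)
print(z.mean(axis=0).round(6), z.var(axis=0).round(6))
```

Keeping `mean` and `std` matters in operation, because each new process data record has to be transformed with the same parameters before it is assigned to a node.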

The local regression models calculated and optimized can be divided between individual associated, "local" control units 30.1...30.n, as shown in figure 8; the calculation of the prognosis values can take place in this case in the local control units 30.1...30.n and serves the purpose of controlling associated, connected process units 31.1...31.n. However, it is also possible to manage the overall model centrally and to calculate the prognosis values for the local control units 30.1...30.n centrally and subsequently distribute them as appropriate.

Also illustrated in figure 8, at 32, is a database for the process data that are conditioned in a data compression and representation unit 33 for the SOM representation. Illustrated at 3 in figure 8 is the prediction unit, which has already been explained with the aid of figure 1 and which is arranged upstream of the previously mentioned control units 30.1, 30.2...30.n. Connected to the latter are the process units 31.1, 31.2...31.n, which finally lead to a process system unit 34.

The components 32, 33 can be denoted as a device for data retention, whereas the units 3 and 30.1, 30.2...30.n define a control system 36, and the process units 31.1, 31.2...31.n as well as the process system unit 34 define an operative system 37.

The present method will now be run through by way of example below with the aid of the steel casting example addressed, which has the variables of concentration K, rate V and temperature T as well as the target variable of tensile strength. The aim in this case is to optimize the tensile strength by optimal setting of V on the basis of predicting the tensile strength as accurately and selectively as possible.

A complete, global regression of the tensile strength to all three variables K, V and T is initially formed in a first step of the method. This regression has a corrected measure of determination of 0.414, that is to say 41.4% of the total scatter can be explained by the global regression. Thereupon, the internal scalings $\sigma_j$ for compensating correlations were used to calculate an SOM that is to be seen in a somewhat simplified illustration in figure 9A (for the variable V = strand removal rate), figure 9B (for the variable K = concentration of Cr) and figure 9C (for the variable T = strand temperature upon removal). The simplification was undertaken, in particular, because the power of a color-coded representation of values was relinquished; instead of this, a five-step black/white representation was selected, white representing the lowest value, dotted areas the next lowest, and so on, and black being the area filling for regions with the highest values.

In the illustration of figure 9, in particular in figure 9A (for the variable V, that is to say the removal rate), it is to be seen that the values are relatively strongly scattered over the entire region, that is to say are moderately well ordered.

Furthermore, in the illustrations of figure 9 (compare figure 9A in particular) one of the nodes has been depicted, at 1, together with a receptive region for the purpose of better understanding, there also being plotted in figure 9A an associated receptive radius $r$ that defines the (circular) receptive region.

A consideration of the nonlinear influence of the individual variables for this representation indicates that the nonlinear influence of the removal rate V is greatest by comparison with the other variables. The following nonlinear influences $I_j$, where $j$ = V, K, T, are calculated for the individual variables:

$I_V = 0.687$, $I_K = 0.210$, $I_T = 0.103$.

$R^2_{NL} = 0.238$ is yielded as the nonlinear measure of determination in the first iteration. This value means that 23.8% of the variance remaining globally unexplained can still be explained by nonlinear (local) regressions.

From the nonlinearities and measures of order of this iteration step, the following internal scalings for an improved SOM representation are derived (see the previous discussion):

$\sigma'_V = 1.634$, $\sigma'_K = 0.711$, $\sigma'_T = 0.543$.

The SOM data representation of the next iteration is parameterized with these new internal scalings, the result being SOM representations that are modified in relation to figure 9, to be precise in accordance with figure 10A for V, in accordance with figure 10B for K and in accordance with figure 10C for T. It may be seen from these new SOM representations that the order has been raised in figure 10A (for the removal rate V), whereas, in particular, the order in figure 10C (temperature) has been reduced. This corresponds to the requirement of more effectively detecting nonlinearities by means of the SOM representation and of being able to use them in the local regressions.

The internal scalings for the next iteration, whose result is illustrated in figures 11A, 11B and 11C, are then calculated using the respective measures of nonlinearity and order as well as the nonlinear measure of determination $R^2_{NL}$. By way of example, in detail here the SOM representation of the variable V (removal rate) is shown in figure 11A, the standardized local regression coefficient $\beta$ for the removal rate against the tensile strength (= target variable) is depicted in figure 11B, and the associated distribution of the optimal receptive radii for the local linear regression is illustrated in figure 11C over the totality of data.

As is to be seen from the illustration in figure 11A, the order inside the SOM for the variable V is further increased in the last iteration step.

Figure 12 shows the change in all the parameters K, V, T as well as $R^2_{NL}$ over the three iteration steps Nos. 1, 2 and 3 in a diagram.

In detail, figure 12 includes the representation of the profile of the nonlinear influences for the individual variables K, V, T as well as of the resulting parameter $R^2_{NL}$ over the iteration steps 1, 2 and 3.

After the 3rd step, the nonlinear measure of determination $R^2_{NL}$ can also be used to explain 34.7% of the remaining 58.6% of the globally unexplained variance as nonlinear, and so a total of 61.7% of the total scatter can now be explained.
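The arithmetic behind these percentages can be checked in a few lines. The values 0.414, 0.238 and 0.347 are those stated in the text; the total after the first iteration is not stated there and is merely derived here in the same way.

```python
r2_global = 0.414                 # corrected measure of determination, global regression
unexplained = 1.0 - r2_global     # fraction left unexplained globally

def total_explained(r2_nl):
    """Global share plus the nonlinearly explained share of the remainder."""
    return r2_global + r2_nl * unexplained

after_step_1 = total_explained(0.238)
after_step_3 = total_explained(0.347)
print(round(unexplained, 3), round(after_step_1, 3), round(after_step_3, 3))
# 0.586 0.553 0.617
```

The local regressions thus raise the explained share of the total scatter from 41.4% to 61.7% over the three iteration steps.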

The prognosis model is used in the production process by assigning each new process data record to that node which corresponds to the respective region of state and/or quality of the process. For each of these regions, there is now a dedicated prognosis model that selectively describes the relationship between the parameters and the target value.

The assignment is performed in accordance with the smallest distance of the data record $x$ from the nodes, using

$$l = \operatorname{argmin}_{i} \lVert x - m_i \rVert$$

The local prognosis model of this node is then applied to the data record, and the predicted tensile strength is used to set the optimal removal rate.
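The assignment and application step can be sketched as follows, assuming NumPy. The node positions and all coefficients are invented, and adding the local correction to the global prediction is an assumption based on the local regressions being fitted to the global residue.

```python
import numpy as np

def nearest_node(x, nodes):
    """Index of the node whose representing vector is closest to x."""
    return int(np.argmin(np.linalg.norm(nodes - x, axis=1)))

def predict(x, nodes, global_coef, local_coefs):
    """Global linear prediction plus the residue correction of the nearest node."""
    l = nearest_node(x, nodes)
    xa = np.concatenate([[1.0], x])          # leading 1 for the intercept
    return float(xa @ global_coef + xa @ local_coefs[l]), l

# Hypothetical model with two nodes in a two-variable space.
nodes = np.array([[0.0, 0.0], [1.0, 1.0]])
global_coef = np.array([1.0, 0.0, 0.0])
local_coefs = np.array([[0.0, 0.0, 0.0],
                        [0.5, 1.0, 0.0]])

value, node_index = predict(np.array([0.9, 0.8]), nodes, global_coef, local_coefs)
print(node_index, round(value, 3))  # 1 2.4
```

In a distributed arrangement each local control unit would hold only the coefficients of its own node, whereas a central arrangement would hold the whole table.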

This prognosis, which is more differentiated than in the prior art, permits a more selective forecast of the tensile strength as a function of K, V and T in the respective local region of state. The application of the overall model to the new data in the course of the production process thus leads to an overall improvement in the quality of the steel product produced.

The invention can, of course, be applied in a similar way to the most varied production processes, in particular also in the case of production lines, as well as to automatic distribution systems and other operative systems.



